;===============================================;
;Dual-Tile Encoding:                            ;
;NES/Famicom Implementation                     ;
;version 1.00                                   ;
;===============================================;
;Written by RedComet                            ;
;redcomet@rpgclassics.com                       ;
;http://www.rpgclassics.com/subsites/twit/      ;
;===============================================;

;===============================================;
;Table of Contents                              ;
;===============================================;
;-Version History                               ;
;-Introduction                                  ;
;-What is DTE?                                  ;
;-Finding the Text Routine                      ;
;-Implementing DTE in an NES game               ;
;===============================================;

;=======================================;
;Version History                        ;
;=======================================;
;Version 1.00:                          ;
;Initial release. If there's demand,    ;
;this will be expanded upon further.    ;
;=======================================;

;=======================================;
;Introduction                           ;
;=======================================;

Dual-Tile Encoding (commonly referred to as DTE in romhacking circles) has to
be one of the easiest things you can do to a game to gain space, yet, like
most assembly stuff, it goes all but undocumented! I'm hoping to rectify that
with this document.

The purpose of this document is to provide you with information and examples
of what DTE is and how to successfully implement it in a NES/Famicom rom. The
below method is how I do it, and it may work on other systems with some
tweaking. In future updates, I hope to address any questions, problems, or
suggestions readers might have.
The point of this is to help you, so if it's not working, email me at
redcomet@rpgclassics.com

Please be a dear and visit my site for this and (hopefully) more documents and
information on various on-going video game translations at:
http://www.rpgclassics.com/subsites/twit/

Before going any further, I want to let you know that basic romhacking
(tables, strings, pointers and the like) and rudimentary assembly programming
knowledge are assumed; if you're new to romhacking, this document isn't for
you. As for the assembly part, just make sure you understand the concepts of
low-level programming and, coupled with a few reference documents for the NES,
you should be fine.

Now, onto the story...

;=======================================;
;What is DTE?                           ;
;=======================================;

DTE stands for Dual-Tile Encoding. This means that one hex value (a single
byte in the script) can represent two characters at the same time. Consider
the following example:

Say you have a table as follows (<space> stands for one space character):

0A=a
0B=b
0C=c
0D=A
0E=.
0F=<space>

And you had the following string:

A cab.

Using the given table, this would be stored in ROM as:

0D 0F 0C 0A 0B 0E - 6 bytes.

Now, if we were to add the following DTE values to the game so that our
table was as follows:

0A=a
0B=b
0C=c
0D=A
0E=.
0F=<space>
10=A<space>
11=ca
12=b.

We could then store the string as:

10 11 12 - 3 bytes; half the size of the non-DTE string.

;=======================================;
;Finding the Text Routine               ;
;=======================================;

I'm including this here, as it's both relevant and sorely undocumented. Please
note, this is my method of finding text routines in NES/Famicom games using
FCEUXD and it assumes the game in question uses a typical pointer system (for
more information on pointers, consult MadHacker's document on the subject).
There are a few ways you can go about finding the text routine with FCEUXD;
the following is the method I've had the most success with.
Please note that it is not only possible, but highly probable, that any two
games will have distinctly different methods of reading text - some may even
have compressed text, in which case you're on your own. That's way beyond the
scope of this document.

Anyway, fire up the rom and bring up a string that you can get to fairly
easily (I like the very first string displayed after starting a new game).
Find this string in rom using your preferred method. Then, calculate and find
the pointer for this string; test it to make sure.

With the string still on-screen, open up the Hex Editor in FCEUXD (it should
be set to NES Memory; if not, change it by going to View and selecting it).
What you're looking for is that pointer you just calculated. Use the Find
feature to search memory for the pointer bytes, in the same order as they
appear in the rom. The results you're concerned with will be between $0000
and $07FF (NES ram is mirrored; for more on this see Yoshi's NES document).
More often than not, the pointer is going to be stored in the Zero Page
($00-$FF). There are exceptions to this, I'm sure, but I've never encountered
any of them. Write down the address(es).

Next, reset the game and open up the Trace Logger. You're going to want to log
this to a file, as the log can, and probably will, get very big. Now, play
through the game until the string you're working with is about to be
displayed. When you're ready, do whatever you need to do to get the string to
start displaying. What you want to do here is start the Trace Logger so that
at least one read is captured. An example is in order. Consider the following
string:

Reginald, I disagree.

As soon as the game begins to display the 'R', we want to start the Trace
Logger so that at least a few of the characters are captured as they're being
processed. When you stop it doesn't really matter, but don't wait too long, as
you'll have a beast of a log to shuffle through.
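
For the pointer search described above, here is a minimal Python sketch of
the arithmetic. It is only an illustration and rests on assumptions that are
not part of the method itself: an iNES file with a 16-byte header, 16 KB PRG
banks, the text bank mapped at $8000, and pointers stored low byte first. The
function names and the file offset $04B9 are made up for the example; a game
with a different mapper layout or pointer format needs different numbers.

```python
INES_HEADER_SIZE = 16   # iNES rom files begin with a 16-byte header
BANK_SIZE = 0x4000      # assumption: 16 KB PRG banks
BANK_BASE = 0x8000      # assumption: the text bank is mapped at $8000


def pointer_for_offset(file_offset):
    """Return the CPU address a pointer would hold for a rom file offset."""
    prg_offset = file_offset - INES_HEADER_SIZE
    return BANK_BASE + (prg_offset % BANK_SIZE)


def search_bytes(pointer):
    """Return the two bytes to search for: low byte first, then high byte."""
    return bytes([pointer & 0xFF, pointer >> 8])


# Hypothetical string found at file offset $04B9 in the rom.
pointer = pointer_for_offset(0x04B9)
print(f"pointer = ${pointer:04X}")                       # pointer = $84A9
print("search for:", search_bytes(pointer).hex(" ").upper())  # search for: A9 84
```

If the search turns up nothing, the assumptions above are the first thing to
doubt, not the game.
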
Open the log file in a text editor (try to avoid Notepad as it doesn't like
files that are very large; I like TextPad). Once the file is opened, search
for

        lda (pointer_address),y
or
        lda (pointer_address,x)

Example: Say the pointer $84A9 is stored at $20-21 in ram. The 6502 stores
addresses low byte first, so at $20 in ram is $A9 and at $21 is $84. The
pointer_address is the address in ram where the pointer is stored ($20-21
here), so you'd search for:

        lda ($20),y
or
        lda ($20,x)

Either the Y or the X register could be used as the index, so search for both
just to be safe. The Y form is by far the more likely one for a text read. In
the X form, X is added to the zero page address before the pointer is fetched,
so the operand in the log will only read $20 if X happens to be zero.

Note: If you don't understand why the above is lda ($20) and not lda ($21),
look up Indirect Indexed Addressing for clarification.

Now that you've found the line of code you think loads the text, you need to
test it. One very simple way is to calculate the address in rom (due to
mappers, the address in the trace may not be correct) and overwrite the
instruction with NOPs (hex equivalent = $EA; lda ($20),y is two bytes long, so
both bytes need to be replaced), save (or do this in FCEUXD's hex editor),
reload (or reset if you're using FCEUXD's hex editor) the rom, play up to the
point where the string is read and see if NOPing the statement out caused any
noticeable changes (this is often referred to as "breaking the rom"). If so,
you might have found the routine; if not, try again.

From here, you can set breakpoints and use the Trace Logger to dump the entire
routine, or you can disassemble the rom and find it that way. I usually
disassemble the bank the code is in and use the debugger to study the code and
see how it works.

That should just about cover one method of finding the text routine. If you
know of another way, let me know and I'll be more than happy to include it
here.

;=======================================;
;Implementing DTE in an NES game:       ;
;=======================================;

Now that you've located the text routine, you need to spend some time studying
it; you want to understand how it works at an almost intimate level.
From there you can determine what hex values are neither control codes nor
actual characters. I prefer one unbroken range of unused values ($80 through
$A0, say, instead of $80-$89 and $90-$A0 with $8A-$8F being used). Of course,
if you have, say, only one or two unused values here and there (or even
characters that aren't used at all), you can use them, but I feel a range of
values makes coding and most things in general easier to manage. Anyway, once
you've found that, you can begin to work on the assembly side of things.

Now, you're going to want to study the game's original, unaltered text read
routine until you know how everything (or almost everything) works; this can
save you a lot of time debugging later. What you're looking for is a place
where the byte is tested (if you're working with a Japanese game and are
translating it to English, you can usually just replace the
Dakuten/Handakuten code with the DTE code). Here's an example (first old and
then the new):

;[old]
text_read:
        ldy index               ;Retrieve the text index from ram.
        lda (pointer),y         ;Load the current byte of text to work with.
control_code:
        cmp #cc_value           ;Here we check to see if the byte read is a
        bne not_cc              ;control code (cc_value is a stand-in for the
                                ;game's control code value). If it's not, we
                                ;move on to the next bit of code that loads
                                ;the tile to be displayed or whatever.

;[new]
text_read:
        ldy index               ;This starts off the same as the old code.
        lda (pointer),y
dte_check:
        cmp #first_dte_value    ;Here we see if the byte is within the range
        bcc control_code        ;of the DTE values. Anything below the first
                                ;value goes back to where the original code
                                ;went, and continues on as normal.
        clc                     ;(Harmless, but not needed: cmp sets the carry
                                ;flag itself.)
        cmp #last_dte_value     ;The last DTE value itself goes to the DTE
        beq dte_code            ;specific code that loads the characters.
        bcs control_code        ;Anything above the last value is not DTE.
                                ;Everything else falls through to dte_code,
                                ;which has to come directly after this check.
                                ;Note: The use of a beq statement following
                                ;the cmp #last_dte_value; this is necessary to
                                ;allow the last value to be processed, because
                                ;bcs is also taken when the byte is equal to
                                ;last_dte_value.

Here first_dte_value and last_dte_value are constants: the first and last
values of the range you picked above.

See, that wasn't too difficult at all.
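
If the branch logic in dte_check is hard to picture, the following Python
sketch walks through it one branch at a time. It is only a model for checking
your understanding, not part of any rom: the function name and the use_beq
switch are made up for illustration, and the range $80-$A0 is just the
example range mentioned earlier.

```python
def goes_to_dte_code(byte, first, last, use_beq=True):
    """Follow dte_check branch by branch; True means we reach dte_code."""
    if byte < first:                # cmp first / bcc: carry clear when A < first
        return False                # -> control_code
    if use_beq and byte == last:    # cmp last / beq: zero set when A == last
        return True                 # -> dte_code
    if byte >= last:                # bcs: carry set when A >= last
        return False                # -> control_code
    return True                     # fall through into dte_code


FIRST, LAST = 0x80, 0xA0            # example range from the text

for value in (0x7F, 0x80, 0xA0, 0xA1):
    print(f"${value:02X} -> dte_code: {goes_to_dte_code(value, FIRST, LAST)}")

# Without the beq, the last value is wrongly treated as a normal byte.
print(goes_to_dte_code(LAST, FIRST, LAST, use_beq=False))  # False
```

The last line shows why the beq matters: bcs on its own cannot tell "equal to
the last value" apart from "above the last value".
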
In a Japanese game, there'll usually be a chunk of code cmp-ing the byte read
to determine whether or not it is a dakuten/handakuten character, like this:

text_read:
        ldy index               ;Same as above examples.
        lda (pointer),y
jap_check:
        cmp n                   ;Where n equals the hex value of the first
        bcs control_code        ;dakuten/handakuten character. If the byte
                                ;isn't one of these we'll continue on and
                                ;check to see if it's a control code.

Usually, you can just overwrite this block of code (jap_check in the above)
with your DTE check. Unless, of course, the game has no such check (a game
that was never Japanese, for instance), in which case you'll need to find a
place where it would be easiest to perform the DTE check.

Once you've got the dte_check in and working, it's time to code the meat and
potatoes of the routine - the actual dte decoding! To determine whether the
first character is to be displayed or the second, a test byte is needed. So,
you're going to have to find a byte of ram that isn't being used by any other
part of the game. It doesn't have to be Zero Page ($00-$FF), either. (The code
below also uses a second spare byte, unused_byte, to hold the X register.)
Once you've found them, we'll look at the code:

dte_code:
        sec                     ;We subtract the start value, because the
        sbc #first_dte_value    ;first value is going to be the very first
        clc                     ;entry in the look-up table.
        stx unused_byte         ;We want to preserve the data in the X
                                ;register somewhere. You never know when
                                ;something elsewhere is going to need that
                                ;data.
        ldx test_byte           ;Here we load the test_byte into X.
        bne second_run          ;You have to make a choice here: Do you want
                                ;test_byte = 1 to signify that you want the
                                ;first byte or the second? I let 0 signify
                                ;that I want the first, and 1 to signify that
                                ;I want the second. That way I get to make use
                                ;of the bne statement to cut down on
                                ;processing time and rom space.
                                ;Note: Using the above convention, we only
                                ;have to test the test_byte to see if it's
                                ;nonzero (ldx sets the zero flag for us). If
                                ;it's nonzero, we know that we don't want the
                                ;first byte. If the test_byte is zero, then
                                ;the branch fails and control falls to the
                                ;first_run routine below, which will retrieve
                                ;the first byte of the DTE pair.
first_run:
        ldx #$01                ;We load #$01 into X, which will be stored as
                                ;the test_byte later. This way, we'll get the
                                ;second byte of the encoded data next time.
        dec index               ;We want to decrement the text index so that
                                ;we read the same byte again (the DTE byte)
                                ;the next time the text read routine is
                                ;executed. Otherwise, we won't be able to read
                                ;both bytes of the DTE. (This relies on the
                                ;game's own code adding one to index for
                                ;every byte it reads.)
        asl a                   ;We double the DTE value to get the position
                                ;of the pair's first byte in the table.
        jmp get_dte             ;We need to get the byte now.
second_run:
        ldx #$00                ;This way, we reset the test_byte so we don't
                                ;cause any problems during future DTE reads.
        asl a                   ;Here, we get the pair's position, and, since
        clc                     ;we want the second byte of the pair, we add
        adc #$01                ;one to the accumulator (the DTE index).
                                ;Control then falls straight into get_dte.

Okay, so that's how the test byte works. At this point, the only thing left to
do is retrieve the value we want from the look-up table. You'll need to have
already generated a DTE table and added it to rom for this, though. We'll
tackle that next before moving on.

You have a few options when it comes to generating the DTE table: you can go
through each script and count the number of times every possible pair of
characters appears; you could code your own program to do this for you; or you
could use one of the existing programs to do it for you. I would recommend
coding your own program to do this, as this will allow you to add all the
features you want (like using multiple scripts to determine the most common
pairs of characters). Seeing as how not everyone is going to want to, or be
able to, do this, your best bet is using an existing program (or you could do
it manually, in which case, you must really hate yourself).
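
If you do want to roll your own, the counting itself is not much work. The
Python sketch below is a deliberately naive illustration and not how any of
the existing tools work: the function names are made up, every script is
treated as plain text, and overlapping pairs are counted (so "aaa" counts
"aa" twice even though only one of them could be encoded). It also knows
nothing about control codes or line breaks, which a real tool would have to
skip.

```python
from collections import Counter


def count_pairs(scripts):
    """Count every adjacent pair of characters across all scripts."""
    counts = Counter()
    for text in scripts:
        for i in range(len(text) - 1):
            counts[text[i:i + 2]] += 1
    return counts


def build_dte_table(scripts, first_value, last_value):
    """Assign the most common pairs to the values first_value..last_value."""
    slots = last_value - first_value + 1
    best = count_pairs(scripts).most_common(slots)
    return {first_value + n: pair for n, (pair, _) in enumerate(best)}


# Two made-up script fragments, counted together.
scripts = ["Reginald, I disagree.", "I agree, Reginald."]
for value, pair in build_dte_table(scripts, 0x80, 0x83).items():
    print(f"{value:02X}={pair}")
```

The output is in the usual table file format, so it can be pasted straight
under the single-character entries of your table.
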
I only know of two programs that read a single file and generate the most
common pairs of letters: DTE Crunch by Klarth and DTE Table Generator by zero
soul (and I'm not even sure if DTE Crunch is publicly available yet). DTE
Table Generator is open source, and I've had fairly decent results with it
(needed to tweak and add a few features here and there, but it got the job
done). I don't really remember using DTE Crunch, so I can't vouch for how good
a program it is. If Klarth's other programs are any indication, DTE Crunch
should be worth your time.

For further information on using these programs to generate DTE tables,
consult the documentation that comes with each program.

Once the table has been generated, you need to insert it into rom. Using the
"A cab." example at the beginning of this document, it would look something
like this:

dte_table:                      ;You can either insert the table manually in a
        .db $0D, $0F            ;hex editor and write down the address to
        .db $0C, $0A            ;include in your code, or you can tack the
        .db $0B, $0E            ;table on in ASM format as shown (.db
                                ;whatever) and let your assembler do the work
                                ;for you. If you go with the latter, consult
                                ;your assembler's documentation to see if and
                                ;how to define data.
                                ;The three pairs here are $10 = "A ",
                                ;$11 = "ca" and $12 = "b.", so
                                ;first_dte_value is $10 and last_dte_value
                                ;is $12.

Now that we have a DTE table, the only thing left is to write some code that
can retrieve data from it. Let's do it!

get_dte:
        stx test_byte           ;Store the value of the test_byte so that the
                                ;next run either gets the second byte of the
                                ;pair or starts fresh on a new byte.
        tax                     ;We're going to use the accumulator as the
                                ;index for the DTE table. (That's why we
                                ;performed that math on the DTE values.)
        lda dte_table,x         ;Finally, we retrieve the value we want from
                                ;the table.
        ldx unused_byte         ;Restore the contents of the X register to
                                ;what it was prior to the dte routine.
        jmp control_code        ;From there, we jump back to the original
                                ;code, and treat the decoded byte as we would
                                ;any other.

There you have it.
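
To convince yourself the pieces fit together before touching the rom, the
whole decode can be modelled in a few lines of Python. This is only a sketch
of the logic, not an emulation of the 6502 code: the names are made up, the
table holds the pairs from the "A cab." example ($10 = "A ", $11 = "ca",
$12 = "b."), and it assumes the game adds one to the text index for every
byte it reads, which is what makes the dec index trick work.

```python
FIRST_DTE = 0x10                      # first_dte_value in the example table
LAST_DTE = 0x12                       # last_dte_value in the example table
DTE_TABLE = [0x0D, 0x0F,              # $10 = "A "
             0x0C, 0x0A,              # $11 = "ca"
             0x0B, 0x0E]              # $12 = "b."
CHARS = {0x0A: "a", 0x0B: "b", 0x0C: "c", 0x0D: "A", 0x0E: ".", 0x0F: " "}


def decode(data):
    """Decode a DTE string the way the patched text routine would."""
    out = []
    index = 0                         # the game's text index
    test_byte = 0                     # 0 = want first byte, 1 = want second
    while index < len(data):
        byte = data[index]
        index += 1                    # assumption: game bumps index per read
        if FIRST_DTE <= byte <= LAST_DTE:
            pair = (byte - FIRST_DTE) * 2      # sbc, then asl a
            if test_byte == 0:                 # first_run
                test_byte = 1
                index -= 1                     # dec index: read it again
                byte = DTE_TABLE[pair]
            else:                              # second_run
                test_byte = 0
                byte = DTE_TABLE[pair + 1]     # adc #$01
        out.append(CHARS[byte])
    return "".join(out)


print(decode(bytes([0x10, 0x11, 0x12])))              # A cab.
print(decode(bytes([0x0D, 0x0F, 0x11, 0x0B, 0x0E])))  # A cab.
```

The second call mixes plain bytes and a DTE byte, which is how real scripts
end up looking once only the most common pairs have values assigned.
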
Here's what it looks like altogether:

text_read:
        ldy index               ;This starts off the same as the old code.
        lda (pointer),y
dte_check:
        cmp #first_dte_value    ;Here we see if the byte is within the range
        bcc control_code        ;of the DTE values. Anything below the first
                                ;value goes back to where the original code
                                ;went, and continues on as normal.
        clc                     ;(Harmless, but not needed: cmp sets the carry
                                ;flag itself.)
        cmp #last_dte_value     ;The last DTE value itself goes to the DTE
        beq dte_code            ;specific code that loads the characters.
        bcs control_code        ;Anything above the last value is not DTE.
                                ;Everything else falls through to dte_code.
                                ;Note: The use of a beq statement following
                                ;the cmp #last_dte_value; this is necessary to
                                ;allow the last value to be processed.
dte_code:
        sec                     ;We subtract the start value, because the
        sbc #first_dte_value    ;first value is going to be the very first
        clc                     ;entry in the look-up table.
        stx unused_byte         ;We want to preserve the data in the X
                                ;register somewhere. You never know when
                                ;something elsewhere is going to need that
                                ;data.
        ldx test_byte           ;Here we load the test_byte into X.
        bne second_run          ;You have to make a choice here: Do you want
                                ;test_byte = 1 to signify that you want the
                                ;first byte or the second? I let 0 signify
                                ;that I want the first, and 1 to signify that
                                ;I want the second. That way I get to make use
                                ;of the bne statement to cut down on
                                ;processing time and rom space.
                                ;Note: Using the above convention, we only
                                ;have to test the test_byte to see if it's
                                ;nonzero. If it's nonzero, we know that we
                                ;don't want the first byte. If the test_byte
                                ;is zero, then the branch fails and control
                                ;falls to the first_run routine below, which
                                ;will retrieve the first byte of the DTE pair.
first_run:
        ldx #$01                ;We load #$01 into X, which will be stored as
                                ;the test_byte later. This way, we'll get the
                                ;second byte of the encoded data next time.
        dec index               ;We want to decrement the text index so that
                                ;we read the same byte again (the DTE byte)
                                ;the next time the text read routine is
                                ;executed. Otherwise, we won't be able to read
                                ;both bytes of the DTE.
        asl a                   ;We double the DTE value to get the position
                                ;of the pair's first byte in the table.
        jmp get_dte             ;We need to get the byte now.
second_run:
        ldx #$00                ;This way, we reset the test_byte so we don't
                                ;cause any problems during future DTE reads.
        asl a                   ;Here, we get the pair's position, and, since
        clc                     ;we want the second byte of the pair, we add
        adc #$01                ;one to the accumulator (the DTE index).
get_dte:
        stx test_byte           ;Store the value of the test_byte so that the
                                ;next run either gets the second byte of the
                                ;pair or starts fresh on a new byte.
        tax                     ;We're going to use the accumulator as the
                                ;index for the DTE table. (That's why we
                                ;performed that math on the DTE values.)
        lda dte_table,x         ;Finally, we retrieve the value we want from
                                ;the table.
        ldx unused_byte         ;Restore the contents of the X register to
                                ;what it was prior to the dte routine.
        jmp control_code        ;From there, we jump back to the original
                                ;code, and treat the decoded byte as we would
                                ;any other.
dte_table:
        .db $0D, $0F            ;$10 = "A "
        .db $0C, $0A            ;$11 = "ca"
        .db $0B, $0E            ;$12 = "b."

And there you have it folks, Dual-Tile Encoding. Simple, no? In my experience,
this is one of the easiest assembly modifications you can do, and is the one I
personally cut my teeth on. If you have any questions, contact me at
redcomet@rpgclassics.com

Note: Much of the above is based on source code Gideon Zhi of Aeon Genesis
(http://agtp.romhack.net/) posted on the old Romhacking.com boards, so thanks
go out to Gideon for, in part, making this possible. Thanks, Gid!